UniNE at CLEF 2006: Experiments with Monolingual, Bilingual, Domain-Specific and Robust Retrieval
Abstract
For our participation in this CLEF evaluation campaign, our first objective was to propose and evaluate various indexing and search strategies for the Hungarian language in order to produce better retrieval effectiveness than a language-independent approach (n-gram). Using both a new stemmer that removes some derivational suffixes and a more aggressive automatic decompounding scheme, we were able to produce better retrieval effectiveness than the corresponding 4-gram indexing scheme. Our second objective was to obtain a better picture of the relative merit of various search engines for the French, Brazilian/Portuguese and Bulgarian languages. To do so, we evaluated these test collections using the Okapi, Divergence from Randomness (DFR) and language model (LM) approaches together with nine vector-processing approaches. After pseudo-relevance feedback, either the DFR or the LM approach tends to produce the best IR performance. For the Bulgarian language, we also found that word-based indexing usually provides better retrieval effectiveness than the corresponding 4-gram indexing. In the bilingual track, we evaluated the effectiveness of various machine translation (MT) systems in automatically translating a query submitted in English into French and Portuguese. After blind query expansion, the mean average precision (MAP) achieved by the best single MT system is around 95% of that of the corresponding monolingual search when French is the target language, or 83% with Portuguese. Using the GIRT corpora (available in German and English), we investigated variations in retrieval effectiveness when faced with a domain-specific collection composed of relatively short bibliographic notices. Finally, in the robust retrieval task we investigated different techniques for improving retrieval performance on difficult topics. In this track, we found that the mean average precision and the geometric mean are strongly correlated. Moreover, massive query expansion based on a search engine did not provide better retrieval effectiveness than Rocchio's approach.
Similar resources
REINA at CLEF 2006 Robust Task: Local Query Expansion Using Term Windows for Robust Retrieval
This paper describes our work at the CLEF 2006 Robust task. This task is an ad-hoc task that explores methods for stable retrieval by focusing on poorly performing topics. We carried out experiments for all subtasks: monolingual (EN, ES, FR and IT), bilingual (IT→ES) and multilingual (ES→[EN ES FR IT]) retrieval. For monolingual retrieval we have focused our work on local query expansion, i.e. usi...
REINA at CLEF 2007 Robust Task
This paper describes our work at CLEF 2007 Robust Task. We have participated in the monolingual (English, French and Portuguese) and the bilingual (English to French) subtask. At CLEF 2006 our research group obtained very good results applying local query expansion using windows of terms in the robust task. This year we have used the same expansion technique, but taking into account some criter...
Report of MIRACLE Team for the Ad-Hoc Track in CLEF 2006
This paper presents the MIRACLE team's 2006 approach to the AdHoc Information Retrieval track. The experiments for this campaign continue to test our IR approach. First, a baseline set of runs is obtained, including standard components: stemming, transforming, filtering, entity detection and extraction, and others. Then, an extended set of runs is obtained using several types of combinations of...
UC Berkeley at CLEF 2003 - Russian Language Experiments and Domain-Specific Cross-Language Retrieval
As in the previous years, Berkeley’s group 1 experimented with the domain-specific CLEF collection GIRT as well as with Russian as query and document language. The GIRT collection was substantially extended this year and we were able to improve our retrieval results for the query languages German, English and Russian. For the GIRT retrieval experiments, we utilized our previous experiences by c...
REINA at CLEF 2009 Robust-WSD Task: Partial Use of WSD Information for Retrieval
This paper describes the participation of the REINA research group at CLEF 2009 Robust-WSD Task. We have participated in both monolingual and bilingual subtasks. In past editions of the robust task our research group obtained very good results for non-WSD experiments applying local query expansion using co-occurrence based thesauri constructed using windows of terms. We applied it again. For WS...